library(tidyverse)
library(collapsibleTree)
library(pg13)
library(chariot)
library(amphora)
conn <- connectAthena()

There are drugs that are used to treat both cancerous and non-cancerous conditions. Therefore, it cannot be reasonably assumed that the patient is a cancer patient if these false positive drugs are present in the drug exposure record. Below is a curated list of the RxNorm Ingredients that qualify as false positive drugs according to feedback from a panel of Oncologists at OHDSI.

Reverse engineering the false positives will allows for identification of the OMOP Concept Classes in ATC and/or HemOnc that contain the false positive drugs that can then be excluded from analyses. This entails finding the immediate parent class of each of the RxNorm Ingredient identified as a false positive drug using relationships with a maximum level of separation of 1 in the Concept Ancestor table.

false_positive_ancestors <- join_for_ancestors(data = false_positive_concepts, 
    descendant_id_column = "concept_id", where_in_concept_ancestor_field = "max_levels_of_separation", 
    where_in_concept_ancestor_field_value = 1, conn = conn)

Additional filters are applied to the parents of the false positive drugs:
1. OMOP Classes
1. ATC and HemOnc Vocabularies because they are the target classifications
1. Concept Classes not equal to ATC 5th because it has a lateral relationship to RxNorm Ingredient Concepts

false_positive_parents <- false_positive_ancestors %>% rename_all(str_replace_all, 
    "ancestor_", "parent_") %>% filter(parent_vocabulary_id %in% 
    c("ATC", "HemOnc"), parent_standard_concept %in% c("C")) %>% 
    filter(!(parent_concept_class_id %in% c("ATC 5th")))
rMarkedDown::print_dt(false_positive_parents)

The classifications for ATC and HemOnc are separated from each other to analyze separately, starting with ATC to return ATC Classes that contain false positive drugs. A color column is added to direct fill in the plot.

atc <- false_positive_parents %>% rubix::split_by(col = parent_vocabulary_id) %>% 
    pluck("ATC") %>% unite(col = false_positive_atc_class, parent_concept_id, 
    parent_concept_name, sep = " ") %>% transmute(false_positive_atc_class, 
    color = "red")
rMarkedDown::print_dt(atc)

To visualize these ATC Classes, the class hierarchy of the ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS topmost ATC class is derived.

atc_descendants <- join_for_descendants(data = tibble::tibble(top_atc_class = 21601386)) %>% 
    filter(descendant_vocabulary_id %in% "ATC") %>% filter(!(descendant_concept_class_id %in% 
    "ATC 5th")) %>% arrange(descendant_concept_class_id) %>% 
    unite(col = descendant, descendant_concept_id, descendant_concept_name, 
        sep = " ", remove = FALSE) %>% pivot_wider(id_cols = top_atc_class, 
    names_from = descendant_concept_class_id, values_from = descendant, 
    values_fn = list(descendant = ~paste(unique(.), collapse = "|"))) %>% 
    rubix::recode_value(col = top_atc_class, `21601386 ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS` = "21601386")

The dataframe of the ATC class hierarchy for ANTINEOPLASTIC AND IMMUNOMODULATING AGENTS is depivoted into a parent-child data tree structure. A color field is added to plot all classes in blue unless it is a class containing a false positive drug, in which case it will be plotted in red.

parent_child_fields <- list(c("top_atc_class", "ATC 1st"), c("ATC 1st", 
    "ATC 2nd"), c("ATC 2nd", "ATC 3rd"), c("ATC 3rd", "ATC 4th"))
output <- list()
for (i in seq_along(parent_child_fields)) {
    x <- atc_descendants %>% select(all_of(parent_child_fields[[i]]))
    colnames(x) <- c("parent", "child")
    output[[i]] <- x
}
atc_data_tree <- bind_rows(output) %>% separate_rows(parent, 
    sep = "[|]{1}") %>% separate_rows(child, sep = "[|]{1}") %>% 
    mutate(color = "blue") %>% distinct()
rMarkedDown::print_dt(atc_data_tree)

The ATC classes containing the false positive drugs is joined to the data tree to denote them in red.

atc_data_tree2 <- atc_data_tree %>% left_join(atc, by = c(child = "false_positive_atc_class"), 
    suffix = c(".dt", ".fp")) %>% transmute(parent, child, color = coalesce(color.fp, 
    color.dt))
rMarkedDown::print_dt(atc_data_tree2)

The root node is converted to NA_character_ as a requirement for the collapsibleTreeNetwork function in the collapsibleTree package.

atc_data_tree2$parent[1] <- NA_character_
rMarkedDown::print_dt(atc_data_tree2)

ATC Plot